Abstract


Animal models of relapse to drug-seeking have borrowed heavily from associative learning approaches. In studies of
relapse-like behaviour, animals learn to self-administer drugs then receive a period of extinction during which they learn to
inhibit the operant response. Several triggers can produce a recovery of responding, and these form the basis of a variety of models.
These include the passage of time (spontaneous recovery), drug availability (rapid reacquisition), extinction of an alternative 
response (resurgence), context change (renewal), drug priming, stress, and cues (reinstatement). In most cases, the 
behavioural processes driving extinction and recovery in operant drug self-administration studies are similar to those in the 
Pavlovian and behavioural literature, such as context effects. However, reinstatement in addiction studies differs from
Pavlovian reinstatement in several respects that have emerged over several decades: experimental procedures, associative
mechanisms, and terminology. Interestingly, in cue-induced reinstatement, drug-paired cues that are present during
acquisition are omitted during lever extinction. The unextinguished drug-paired cue may limit the model’s translational 
relevance to cue exposure therapy and renders its underlying associative mechanisms ambiguous. We review major behavioural 
theories that explain recovery phenomena, with a particular focus on cue-induced reinstatement because it is a widely used 
model in addiction. We argue that cue-induced reinstatement may be explained by a combination of behavioural processes, 
including reacquisition of conditioned reinforcement and Pavlovian to Instrumental Transfer. While there are important 
differences between addiction studies and the behavioural literature in terminology and procedures, it is clear that 
understanding associative learning processes is essential for studying relapse. 
Behavioural paradigms in addiction neuroscience often rely
on classic Pavlovian and operant conditioning processes. While 
theoretical associative learning mechanisms are based on both 
appetitive and aversive conditioning, in addiction neuroscience 
they are applied to model appetitive motivation for drugs. In 
Pavlovian conditioning, animals readily learn to expect the deliv- 
ery of an appetitive or aversive outcome upon the presentation 
of a cue that has consistently been paired with outcome delivery.
For example, a light (conditioned stimulus, CS) that reliably
predicts the occurrence of a food reward or foot-shock (unconditioned
stimulus, US) can evoke conditioned responding such
as magazine approach or freezing, respectively, in the absence
of that outcome. In instrumental or operant conditioning, ani- 
mals learn to perform a response such as a lever press or nose- 
poke to obtain a desired outcome, which is often paired with 
the presentation of a visual and/or auditory cue. Addiction neu- 
roscience takes advantage of both Pavlovian and operant condi- 
tioning through procedures such as conditioned place preference, 
which studies the development of Pavlovian associations between 
experimenter-administered drugs and specific contexts [1], and 
drug self-administration studies, where animals perform an op- 
erant response to obtain drug rewards [2]. These conditioning 
paradigms enable the study of how animals learn about rewards
and their associated cues and are often complemented by studies of 
extinction and reinstatement or recovery-from-extinction which 
are designed to study how changes in the relationship between 
the cue and the outcome can alter the behavioural response to 
that cue. Each type of recovery-from-extinction phenomenon has 
its own behavioural processes and theoretical explanations, with 
some of the most complex associative processes occurring during 
cue-induced reinstatement. As recently reviewed by Konova and 
Goldstein, there are a number of parallels between extinction in 
experimental animal models and in humans [3]. Therefore, understanding
the behavioural and psychological processes mediating
these forms of learning can provide critical insight into the treatment
and relapse of problem behaviours observed in patients suffering
from psychological disorders that affect appetitive motivation,
particularly substance use disorders.


Extinction and its Signature Characteristics 


Extinction is the most basic and reliable method for reducing 
unwanted learned behaviours. In essence, extinction of the be- 
havioural response occurs when the association between the cue 
and the outcome is weakened through the continual presentation 
of the conditioned cue alone without the delivery of the expected 
outcome [4]. Extinction is a fundamental process underlying psy- 
chotherapy. Extinction-based addiction therapies include cue ex- 
posure therapy which is effective for alcohol use disorder [5] and 
virtual exposure therapies which are employed for substance use 
disorders and behavioural addictions [6]. However, cue exposure 
therapies have also frequently been found to be ineffective in clinical
populations [7-9] and cue exposure therapy is not often implemented
[5]. These shortcomings have led many researchers to
examine why cue exposure therapy is often ineffective and to pro- 
pose various modifications to clinical approaches that might im- 
prove its efficacy [3, 9, 10]. A major clinical limitation of the effi- 
cacy of extinction-based treatments is that extinction is transient 
and the extinguished response can return under a variety of conditions,
giving rise to spontaneous recovery, rapid reacquisition, resurgence,
renewal, and reinstatement [11-14]. This return of the original
behaviour is taken as evidence that extinction does not completely
erase the original association between the cue and the outcome.
Rather, it produces new learning that competes with the
original memory for behavioural control and is highly dependent 
on environmental cues for its retrieval. 

Behavioural studies have identified several factors which can 
trigger a recovery of responding after extinction. For instance, a 
previously extinguished Pavlovian conditioned response can re- 
emerge simply with the passage of time, known as spontaneous 
recovery [4, 15, 16]. In rapid reacquisition, re-establishing the 
cue-outcome association results in a faster rate of acquisition compared
to initial conditioning [4, see also 17]. Resurgence occurs
when an operant response is extinguished while an alternative be- 
haviour is reinforced; if the alternative behaviour then undergoes 
extinction the former operant response can recover or resurge [18]. 
In the case of renewal, recovery of the extinguished response can 
be triggered by changing contexts after extinction, where a novel 
context is sufficient to renew responding to that extinguished cue 
and a return to the acquisition context produces the most robust 
recovery effects [19-21]. Finally, in reinstatement paradigms, 
unsignalled presentations of the outcome are sufficient to cause 
a return of responding to an extinguished cue [22]. 

Similar recovery-from-extinction phenomena have been doc- 
umented in drug conditioning and self-administration studies 
[23-26]. Many relapse models in addiction neuroscience are based 
on these classic Pavlovian and operant models, with similar recov- 
ery mechanisms. However, there are some procedural differences 
that distinguish recovery phenomena between the two fields. We
briefly review the associative basis of spontaneous recovery, reac- 
quisition, and resurgence [for more detail, see 14, 27, 28] in Pavlo- 
vian and instrumental paradigms and argue that the associative 
mechanisms between them are largely comparable. However, 
most addiction neuroscience research is centred on the renewal 
and reinstatement paradigms and so these phenomena are the fo- 
cus of the current paper. While renewal and the drug-primed and 
stress-induced variants of reinstatement are also readily compa- 
rable with Pavlovian recovery phenomena, a comparison between 
experimental approaches reveals that cue-induced reinstatement 
is driven by ambiguous associative processes. We therefore decon- 
struct cue-induced reinstatement at a behavioural level to further 
understand the associative mechanisms driving recovery and ar- 
gue that it may be driven by a combination of (reacquisition of) 
conditioned reinforcement and Pavlovian to Instrumental Trans- 
fer. 


Spontaneous Recovery 


Spontaneous recovery is one of the most basic recovery-from- 
extinction phenomena that provides evidence that the original 
learning survives the extinction procedure. In Pavlov’s original 
studies [4], an extinguished conditioned response spontaneously 
recovered to a level above the minimum achieved at the end of ex- 
tinction after a period of time had elapsed [4, 29-31]. This level 
of recovery in behaviour can vary depending on the length of time 
that intervenes between the extinction and test session such that 
the longer the period between extinction and test, the higher the level
of spontaneous recovery [32]. The spontaneous recovery effect
has also been observed in instrumental studies by Skinner and
others as early as the 1930s [29-31]. Spontaneous recovery has 
been shown occasionally in the addiction literature and can occur 
in drug self-administration studies [33-39] and in Pavlovian con- 
ditioning studies with drug reinforcers [26, 40]. 

The current, dominant view of the spontaneous recovery effect 
in both Pavlovian and operant studies is that a shift in the tempo- 
ral context triggers a recovery of responding [41]. Skinner has ar- 
gued that unavoidable cues close to the beginning of the test ses- 
sion such as transport and handling help to promote spontaneous 
recovery [42]. Bouton has since incorporated Skinner’s argument 
as evidence of contextual effects in spontaneous recovery [41]. In 
addition to the temporal context view, Rescorla has also reviewed 
several alternative explanations to describe why spontaneous re- 
covery is observed following extinction of the response [32]. These 
include local performance effects, such as emotional states that 
build up during extinction [43] but which may have dissipated by 
the time of the test, and the effects of response fatigue which re- 
duce responding during the extinction session but dissipate over 
time allowing for the restoration of responding in a subsequent 
session [32]. However, Rescorla notes that for several of these al- 
ternative explanations, empirical support has been mixed or lack- 
ing [32]. 

An alternative view considers spontaneous recovery as a re- 
sult of differential rates of decay between the original associa- 
tion and the inhibitory extinction memory [27]. During acqui- 
sition, excitatory associations between the cue and the outcome 
are learned, while during extinction a separate inhibitory associa- 
tion is learned. Over time, the original association decays slowly, 
while the inhibitory association decays more quickly. A given pe- 
riod of time between the end of extinction and test will therefore 
involve much greater loss of the extinction memory’s inhibitory 
association than decay in the acquisition memory’s excitatory as- 
sociation. Differential decay explains observations that longer pe- 
riods of time between extinction training and test tend to result 
in greater spontaneous recovery, while spontaneous recovery is 
reduced if a delay also occurs between the end of acquisition and 
beginning of extinction training [44]. In the case of an extended
interval between extinction and test, there has been more time for 
the extinction memory to decay quickly, but the acquisition mem- 
ory is relatively intact. In the case of delayed extinction and a de- 
layed test, although the extinction memory decays prior to test, 
the extended period of time since acquisition gives the original 
memory time to decay as well [27]. 
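
The differential decay account can be illustrated with a minimal numerical sketch. The exponential form of decay, the rate constants, the arbitrary time units, the assumption that extinction builds exactly enough inhibition to cancel the remaining excitation, and the function name `recovery_at_test` are all assumptions made for illustration; none of them are taken from [27] or [44].

```python
import math

def recovery_at_test(acq_to_ext_delay, ext_to_test_delay,
                     k_excitatory=0.01, k_inhibitory=0.1):
    """Toy net associative strength at test (hypothetical parameters).

    Assumes exponential decay, with the inhibitory (extinction) association
    decaying faster than the excitatory (acquisition) association.
    """
    # Excitatory strength remaining when extinction begins.
    excitation_at_ext = math.exp(-k_excitatory * acq_to_ext_delay)
    # Assume extinction builds just enough inhibition to cancel it.
    inhibition_at_ext = excitation_at_ext
    # Both associations then decay over the extinction-to-test interval.
    excitation = excitation_at_ext * math.exp(-k_excitatory * ext_to_test_delay)
    inhibition = inhibition_at_ext * math.exp(-k_inhibitory * ext_to_test_delay)
    return excitation - inhibition

short_gap = recovery_at_test(acq_to_ext_delay=0, ext_to_test_delay=1)
long_gap = recovery_at_test(acq_to_ext_delay=0, ext_to_test_delay=10)
delayed_extinction = recovery_at_test(acq_to_ext_delay=50, ext_to_test_delay=10)
print(short_gap, long_gap, delayed_extinction)
```

Under these assumed values, the longer extinction-to-test interval leaves more net excitation than the shorter one, and inserting a delay between acquisition and extinction lowers net excitation at the same test interval. In this toy version, recovery would eventually decline again at very long intervals because the excitatory association also decays.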


Another possible explanation for how the passage of time influences
recovery is that sensitization and habituation change
over time [45]. Sensitization describes a response that
is increased (becomes more sensitive) to a repeatedly presented 
stimulus, while habituation describes a response decrement that 
occurs to a stimulus across repetitions. Sensitization and habit- 
uation are separate processes that can occur at the same time in 
response to repeated stimulus presentations. For example, Mc- 
Sweeney and colleagues have argued that sensitization and habit- 
uation both occur within operant sessions, as responding is sen- 
sitized and increases early in the session but then declines later 
in the session as habituation becomes dominant [45]. Moreover, 
aversive Pavlovian conditioning studies have shown that there is 
substantial overlap in the behaviour and neuropharmacology of 
habituation and extinction [46, 47], providing further support for 
the idea that habituation is an inhibitory process distinct from sen- 
sitization. According to this view, spontaneous recovery occurs 
because the habituation decays over time, even when animals re- 
main in the conditioning chamber between sessions, leaving a sen- 
sitized response [48]. 


The idea that sensitization may drive increased responding 
with the passage of time finds empirical support in addiction mod- 
els, such as incubation of craving and psychostimulant sensitiza- 
tion studies. The incubation of craving model was based on clin- 
ical addiction studies where participants reported the subjective 
feeling of craving returning in phases or episodes [49, 50]. In the 
laboratory, animals learn to self-administer drugs and are then 
subjected to a period of forced abstinence. At test, responding
is elevated as a function of the length of the abstinence period. Thus,
despite the fact that incubation of craving does not involve instru- 
mental extinction prior to testing [49, 51-53], this model draws 
parallels with spontaneous recovery by demonstrating that the 
passage of time can increase operant responding. Moreover, like 
spontaneous recovery, incubation of craving has been observed 
with a variety of drug and non-drug reinforcers, including cocaine, 
methamphetamine, and palatable food [49, 51-53]. It is thought 
that sensitization may develop over the course of abstinence to 
drive an increase in responding [49, 54], suggesting that sensiti- 
zation lasts longer than habituation. 


Psychostimulant sensitization studies also support the idea 
of long-lasting sensitization effects that become more apparent 
with the passage of time. In psychostimulant sensitization stud- 
ies, animals receive experimenter-administered drugs (e.g. am- 
phetamine) for several days. At test, they receive a challenge dose 
and sensitization is seen when animals have an elevated response 
to the drug compared to controls. A period of time between the 
end of the experimenter-administered drug exposure phase and 
test is essential to observe sensitization in these experiments [55]. 
It is notable that while spontaneous recovery can be observed in 
minutes with a food reinforcer [48], incubation of craving and psy- 
chostimulant sensitization studies require days to weeks to elapse 
before effects can be observed [49, 55]. However, this timeline is 
consistent with the procedures used to test for spontaneous recov- 
ery after extinction in animals trained to self-administer cocaine 
[33, 34], alcohol [35-37], and nicotine [38, 39]. Taken together,
evidence from incubation of craving and psychostimulant sensitization
studies shows that the increased responding observed after
a period of time may be driven by sensitization that persists longer 
than habituation. 


Rapid Reacquisition


Rapid reacquisition in a Pavlovian design involves the restoration 
of the cue-outcome association after extinction. Pavlov was the 
first to observe that the fresh application of the US restored conditioned
responding after extinction [4]. Similarly, Skinner studied
both operant conditioning and the reconditioning of animals after
extinction [31]. In both cases, it is understood that the availabil- 
ity of the outcome promotes the recovery effect and demonstrates 
that extinction has not completely erased the initial training mem- 
ory [56]. Classical associative learning models can account for 
rapid reacquisition if the extinguished cue retains some latent associative
strength that facilitates reacquisition, though some models
such as Rescorla-Wagner have difficulty explaining the related
phenomenon of slow reacquisition [27]. For Bouton, the conditions
of rapid reacquisition return the animal to the original training 
context, facilitating retrieval of the original acquisition memory 
[14]. Consistent with the context retrieval account, Skinner ob- 
served that conducting multiple rounds of extinction and recon- 
ditioning produced successively more rapid extinction curves [31]. 
Animals may therefore become more adept at distinguishing be- 
tween the acquisition or self-administration context and the ex- 
tinction context. In either case, drug self-administration studies 
have shown that reacquisition after extinction is more rapid than 
the initial acquisition of the response [57]. 
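
The latent-strength argument can be made concrete with a minimal sketch of the Rescorla-Wagner update, in which associative strength V moves a fixed fraction of the way toward the asymptote supported by the outcome on each trial. The learning rate, the response criterion, the number of extinction trials, and the function names are assumptions chosen for illustration, and the sketch does not address slow reacquisition.

```python
def rw_update(v, lam, rate=0.2):
    """One Rescorla-Wagner trial: move V toward the asymptote lam."""
    return v + rate * (lam - v)

def trials_to_criterion(v_start, criterion=0.9, rate=0.2):
    """Count reinforced trials (lam = 1) needed for V to reach criterion."""
    v, trials = v_start, 0
    while v < criterion:
        v = rw_update(v, lam=1.0, rate=rate)
        trials += 1
    return trials

def extinguish(v, n_trials, rate=0.2):
    """Apply n non-reinforced trials (lam = 0)."""
    for _ in range(n_trials):
        v = rw_update(v, lam=0.0, rate=rate)
    return v

initial_trials = trials_to_criterion(0.0)
# Rebuild V after acquisition, then extinguish incompletely (5 trials, assumed).
v_acquired = 0.0
for _ in range(initial_trials):
    v_acquired = rw_update(v_acquired, lam=1.0)
v_residual = extinguish(v_acquired, n_trials=5)
reacquisition_trials = trials_to_criterion(v_residual)
print(initial_trials, reacquisition_trials)
```

In this sketch the savings depend entirely on the residual strength left by incomplete extinction. If extinction were run until V approached zero, the same rule would predict reacquisition no faster than original learning.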

However, if reacquisition is driven by a context effect, then it 
should be influenced by context manipulations and the evidence 
for this influence is mixed. Bouton and Swartzentruber found that 
context had an effect on reacquisition of conditioned suppression, 
with slower reacquisition in a distinct extinction context [58]. In 
contrast, Willcocks and McNally found overall performance dur- 
ing reacquisition of alcohol self-administration sessions was not 
significantly different when tested in the same context as acquisi- 
tion and extinction, a novel context, or a distinct extinction con- 
text [59]. The only context effect that they observed was altered 
latency to first response in the extinction context during reacqui- 
sition testing [59]. Moreover, Willcocks and McNally have also 
shown distinct neurobiological substrates mediate these effects 
[60]. They found that inactivation of the prelimbic cortex reduced 
contextual renewal but increased responding during reacquisition 
[60]. These neurobiological differences were recently extended to 
the pattern of activation in the mesolimbic dopamine system, such 
as the medial ventral tegmental area, which was required for reac- 
quisition but not renewal [61]. While rapid reacquisition is gener- 
ally an underutilised paradigm in drug self-administration stud- 
ies [62], these neurobiological findings indicate that contextual 
renewal and rapid reacquisition are dissociable, suggesting that 
mechanisms other than renewal contribute to rapid reacquisition. 


Resurgence 


Resurgence is a recovery phenomenon defined by the reappear- 
ance of an extinguished response during the extinction of an alter- 
native response, and is unique to operant paradigms [14, 63, 64]. 
For example, during Phase 1, rats are trained to press a lever for 
food reward. In Phase 2, pressing on the original lever is no longer 
reinforced and pressing on an alternative lever is reinforced. In 
Phase 3, the alternative lever is extinguished which causes re- 
sponding on the original lever to increase. According to Epstein, 
the earliest reports of resurgence effects were by Hull in 1934 [65]. 
Hull trained rats to run down a 20-foot runway, performed extinction
(then described as “frustration”), then retrained them to
run down a 40-foot runway [66]. During extinction with the 40- 
foot runway, rats appeared to run faster in the leadup to the 20- 
foot marker, reminiscent of the original response [66]. Another 
early report of resurgence was a 1951 conference presentation by 
Carey, where the effect was described in rats trained to lever press 
for food pellets and described as “reinstatement” or “regression”
[67]. Resurgence was then rediscovered in 1970 by Leitenberg, 
Rawson, and Bath, who found rats would resume operant respond- 
ing during the extinction of an alternative behaviour [68]. How- 
ever, it would not be until Epstein and Skinner began studying the 
effect in the 1980s that it would take on the name of resurgence 
[18, 65]. 

Multiple theories have been proposed to explain resurgence. 
For example, Leitenberg and colleagues originally approached 
resurgence from the perspective of response competition [68], a 
theory that Bouton and colleagues argue is contradicted by evi- 
dence that resurgence is unaffected by the schedule of reinforce- 
ment used for the alternative behaviour [14]. A second explana- 
tion is based on behavioural momentum theory, where reinforce- 
ment of the alternative response increases the disruption to the 
original response but simultaneously strengthens the original re- 
sponse by providing additional reinforcement in the same context 
[69]. However, Shahan, who had previously argued for the appli- 
cation of behavioural momentum theory to resurgence, has more 
recently argued that this theory has encountered difficulty ex- 
plaining results such as the original response becoming more per- 
sistent during extinction even though alternative reinforcement 
was available [70]. Shahan and Craig therefore have proposed the 
“resurgence as choice” model, which argues that animals allocate 
responses based on the values of those responses over time [70]. 
As the alternative response is extinguished, the value of the orig- 
inal response becomes relatively higher and therefore receives an 
increased behavioural allocation [70]. 
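
The allocation logic described above can be sketched with a simple ratio rule, in which the share of responding directed at the original response equals its value relative to the summed values of both responses. This is a deliberately simplified illustration and not the published equations of the resurgence as choice model [70]; the starting values, the decay rate of the alternative's value, and the function names are hypothetical.

```python
def allocation_to_original(value_original, value_alternative):
    """Share of responding given to the original response (simple ratio rule)."""
    total = value_original + value_alternative
    if total == 0:
        return 0.5  # no basis for preference; split evenly (assumption)
    return value_original / total

def phase3_allocations(value_original=0.1, value_alternative=1.0,
                       decay=0.5, sessions=5):
    """Track allocation while the alternative response is extinguished."""
    allocations = []
    for _ in range(sessions):
        allocations.append(allocation_to_original(value_original, value_alternative))
        # Extinction of the alternative response lowers its value (assumed rate).
        value_alternative *= decay
    return allocations

allocations = phase3_allocations()
print([round(a, 3) for a in allocations])
```

Under these assumptions, the original response holds a small residual value throughout, yet its relative allocation rises across Phase 3 sessions purely because the alternative loses value. The sketch captures only relative allocation and says nothing about absolute response rates.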

In contrast with Shahan, Bouton views resurgence as a context 
effect [14]. Bouton and colleagues have previously suggested that 
the alternative response and its reinforcement represent a distinct 
extinction context for the original response. When the alternative 
response is extinguished, this creates a novel context which re- 
sults in renewal [14]. In response to Shahan and Craig’s adoption 
of the resurgence as choice model, Bouton and colleagues noted 
that contextual elements are now incorporated into resurgence as 
choice [28], suggesting that there is growing theoretical conver- 
gence in the behavioural understanding of resurgence. 

Interest in resurgence as an addiction model arises from its 
potential relevance to contingency management and related approaches
which incentivise abstinence from substance use by providing
non-pharmacological rewards, in other words, by reinforcing
alternative behaviours [71, 72]. The resurgence effect suggests
that there is potential for relapse when therapy concludes and in- 
centives are withdrawn [14] which is known to occur [73]. Ani- 
mal studies have shown that resurgence occurs when drugs such 
as alcohol or cocaine are used as reinforcers [74-76]. One potential
point of contention is that Bouton and colleagues have conceptualised
resurgence as renewal due to a novel context (‘C’) that is
distinct from both the acquisition context (‘A’) and the extinction 
context (‘B’), also known as ABC renewal [14]. However, attempts 
to show ABC renewal with alcohol have not been successful in either
an operant design [77] or a Pavlovian design [78]. It is therefore
possible that resurgence in drug self-administration studies may 
be driven by factors other than contexts, underscoring the impor- 
tance of understanding the behavioural mechanisms which drive 
resurgence as an avenue for further research. 


Renewal 


Contextual theories of recovery phenomena have been extraordi- 
narily influential. In 1978, Welker and McAuley first demonstrated 
that contextual stimuli could produce a recovery of operant re- 
sponding for food [79]. In 1979, Bouton and Bolles published two 
papers on contextual effects on the recovery of extinguished con- 
ditioned suppression [21, 80]. In one paper, they showed that rats 
that received conditioning in one context and extinction in a distinct
context recovered or renewed conditioned suppression when
they were returned to the original training context [21]. In another
study, they showed that when the US (foot-shocks) was delivered 
in the conditioning chambers the day after extinction but before 
test, this only produced reinstatement when the foot-shocks were 
delivered in the same context as testing [80]. These early papers 
showed the importance of contexts in extinction learning and re- 
covery phenomena and would form the basis of a decades-long re- 
search programme that has since extended the renewal model to 
include both Pavlovian and operant conditioning, multiple contex- 
tual manipulations, and an increasingly broad definition of con- 
texts [14, 17, 19, 21, 28, 41]. 

The definition of context has now expanded beyond environ- 
ments that can be distinguished by various stimuli. As alluded 
to in sections on spontaneous recovery, reacquisition, and resur- 
gence, Bouton also views the presence or absence of manipulanda 
and reinforcement as contextual factors, as well as the passage 
of time [14]. In addition to the spatial contexts defined by en- 
vironmental stimuli, there are also temporal, interoceptive (e.g. 
hormonal or physiological states), cognitive, and social or cul- 
tural contexts [19, 28]. Contextual renewal processes are there- 
fore thought to be common to virtually all forms of relapse-like 
behaviour [14, 28, 41]. 

In classical renewal models, contexts are defined by environ- 
mental visual, olfactory, and tactile cues. These cues are used to 
present animals with distinct contexts during acquisition, extinc- 
tion, and renewal sessions. In the classic ABA design, a Pavlovian 
or operant response is acquired in one context (‘A’), followed by ex- 
tinction in a second context (‘B’), and then animals are returned 
to the original training context (‘A’) for renewal [21]. It has since 
been shown that renewal can also be triggered simply by removing 
animals from the extinction context, whether extinction occurs in 
the same context as acquisition (AAB renewal) or whether extinc- 
tion occurs in its own distinct context and testing for renewal oc- 
curs in a novel context (ABC renewal) [14]. Experiments by Bouton 
and King showed that contexts tend not to accrue inhibitory as- 
sociative value during extinction [81], in contrast with the predic- 
tions of the Rescorla-Wagner model [82]. Bouton and colleagues 
have therefore maintained, since the 1980s, that contexts have 
an occasion setting mechanism which retrieves specific associa- 
tions [14, 28, 83]. In other words, acquisition and extinction cre- 
ate separate associations with a specific Pavlovian cue or operant 
response and context determines which of these associations will 
be retrieved, driving renewal in various contexts [14, 28, 83]. 

In addiction models, interest centres primarily on the ABA re- 
newal paradigm [28, 62]. ABA renewal has been observed in self- 
administration studies with a variety of drugs, such as alcohol 
[59, 60, 77, 84], cocaine [85, 86], cocaine-heroin mixtures [87], 
heroin [88], and nicotine [89]. However, other forms of renewal 
such as AAB and ABC renewal are not so readily observed in studies 
using drug reinforcers. For example, Zironi and colleagues tested 
rats trained to self-administer alcohol in a novel context (ABC re- 
newal) and did not observe significant recovery while rats trained 
to self-administer sucrose did show ABC renewal [77]. Similar re- 
sults have been seen in an appetitive Pavlovian design, where ABA 
renewal was observed with both alcohol and sucrose reinforcers 
[78]. However, neither AAB nor ABC renewal was observed with 
either reinforcer [78]. Similarly, Crombag and colleagues did not 
observe AAB renewal following self-administration of a cocaine- 
heroin mixture [87] and Fuchs and colleagues observed ABA but 
not AAB renewal of cocaine self-administration [90]. Bossert 
and colleagues also did not observe AAB renewal of heroin self- 
administration [88], nor was AAB renewal for nicotine observed by 
Diergaarde and colleagues [89]. Prior studies that have shown AAB 
or ABC renewal in operant designs have all used food pellets or sucrose
as reinforcers [78, 91-93]. In instrumental studies with food
reinforcers, manipulations such as conducting acquisition training
in multiple contexts have been shown to enhance ABC renewal
[94], indicating the potential for improved generalisation of the
acquisition memory to promote ABC renewal in addiction studies. 
However, this possibility has not yet been tested, so it remains un- 
clear why these other forms of renewal are not easily observed us- 
ing drug reinforcers. Nonetheless, the contextual renewal effect 
appears to be sensitive to the identity of the reinforcer and this 
may be important for its applications in addiction neuroscience 
and clinical practice. 

The addiction literature also refers to renewal designs as 
“context-induced reinstatement”. While Bouton and colleagues 
have consistently used the term renewal to refer to recovery due 
to context change since 1979 [14, 21, 28, 41, 81], there is some 
historical precedent for this alternative nomenclature. In 1978, 
Welker and McAuley reported that when contextual and transport 
cues were “reinstated” after extinction, operant responding
for food returned [79]. Although their work is cited as an example
of renewal [41], they were actually interested in extinction
and spontaneous recovery and do not use the term renewal
[79]. In fact, Welker and McAuley refer to their contextual manipulations
as “reinstating responding during the final session of
extinction” [79]. By the early 2000s the term renewal had be- 
come well-defined and widespread in both the Pavlovian and op- 
erant literature [14, 41, 95, 96], including studies involving drug 
self-administration [25, 85, 87, 89]. However, addiction neuro- 
scientists also began to refer to renewal designs as reinstatement 
around the same time [88, 97-99], with some authors transition- 
ing from using renewal to reinstatement [87, 100, 101] and oth- 
ers applying both terms interchangeably [84, 86, 89, 102, 103]. As 
will be discussed further below, reinstatement in the addiction lit- 
erature has become an umbrella term that covers multiple recov- 
ery phenomena beyond the classical design of Pavlovian reinstate- 
ment studies. These differing recovery phenomena have diverse 
associative mechanisms due to differences between Pavlovian and 
instrumental learning, as well as the various processes used to pre- 
cipitate recovery-from-extinction. 


Reinstatement 


In Pavlovian experiments, reinstatement of responding to an ex- 
tinguished cue is typically observed following re-exposure to the 
aversive or appetitive event, often unsignalled and usually prior to 
testing [22]. For instance, studies of extinction following Pavlo- 
vian fear conditioning show that presentations of the foot-shock 
can reinstate fear responding [17, 21, 22, 41, 56, 104, 105], as can
other aversive triggers that induce the state of fear, such as exposure
to stressors (e.g. a milder foot-shock than that used in conditioning
[105]) or to a conditioned context [106, 107]. This phenomenon
has also been demonstrated in animal studies of reward
learning [108], as well as in human studies [109-112]. The term 
reinstatement is also used to describe recovery that occurs when 
the CS is tested with other stimuli that have been separately con- 
ditioned with the US [41], although definitions of reinstatement 
do not always recognise this usage [28]. For example, Halladay 
and colleagues found that presentations of an equally aversive, un- 
extinguished CS after extinction training reinstated freezing to a 
different, extinguished CS on test [113]. Moreover, this reinstate- 
ment of responding induced in the absence of the US is evident in 
human research [114]. 

Reinstatement is thought to depend on the restoration or re- 
trieval of the association between the CS and US. A typical Pavlo- 
vian design is shown in Figure 1a-c. During acquisition, an asso- 
ciation between the CS and US is acquired (Figure 1a). This asso- 
ciation is weakened throughout extinction (Figure 1b). The pre- 
sentation of the US, which usually occurs the day prior to the re- 
instatement test [41], then produces a restoration of the CS-US 
association (Figure 1c). Since Pavlov, it has been thought that extinction
involves an inhibitory response that suppresses the conditioned
response [4, 41]. Even researchers who have argued that extinction
may involve (partial) erasure do not argue against the survival
of the original association and new learning of an inhibitory
response [56]. According to one view, the presentation of the US 
is thought to reactivate the original association with the CS and 
thus lead to a restoration of conditioned responding [41]. Alternatively,
Bouton and colleagues argue that reinstatement depends on
the context being associated with the US [28], because reinstatement
does not occur if the US is presented in a different context
[80]. Presentations of the US can also strengthen the CS-US association,
despite the absence of the CS, via mediated conditioning
[104]. The prior presentations of the US may therefore restore its 
association with the context, which then enables retrieval of the 
CS-US association during test. 


Historical Use of the Term “Reinstatement” in 
Addiction Studies 


In operant drug self-administration studies, what is described as 
reinstatement differs from the stricter definition used in the Pavlo- 
vian and behavioural literature [28, 41]. The use of the term “rein- 
statement” in a manner distinct from but related to the definition 
used in Pavlovian conditioning began to emerge in the addiction 
neuroscience literature in the 1970s and early 1980s [115]. These 
early studies all used the term reinstatement to describe the return 
of responding that was observed as a result of re-exposure to drugs 
or drug-associated cues. As early as 1971, Stretch and colleagues 
reported that instrumental responding for amphetamine could be
reinstated by injections of amphetamine, which they theorised was
caused by “reinstatement of the drug state” [116]. In this 1971 paper,
and in two subsequent reports, they also used reinstatement to
refer to the restoration of response rates that occurred due to drug
injections [116-118]. In 1976, Davis and Smith described training
a neutral stimulus as a conditioned reinforcer by pairing it with 
intravenously self-administered morphine [119]. After extinction 
of the instrumental response, the conditioned reinforcer was de- 
scribed as causing “reinstatement” or “restoration” of the instru- 
mental response [119]. The experimental approach of Davis and 
Smith is now the basis of the cue-induced reinstatement model in 
widespread contemporary use. 

In the early 1980s, de Wit and Stewart reported “reinstate- 
ment” of operant responding for cocaine and heroin following 
injection of various drugs or presentation of a cue that had pre- 
viously been paired with drug delivery [120-123]. These semi- 
nal papers are credited with establishing the reinstatement model 
which has been used, in various forms, by addiction neuroscien- 
tists ever since [115]. Numerous other studies have now shown 
that presentations of a food or drug prime can also reinstate oper- 
ant responding following extinction of the instrumental response 
[23, 24, 116, 119, 124-129]. Moreover, in addition to the drug- 
primed and cue-induced reinstatement models already developed, 
subsequent studies also showed that reinstatement could be pre- 
cipitated in animals by stressors such as foot-shocks [130, 131] or 
by combining multiple precipitating factors, for example, by using 
both cue presentation and drug priming [132]. As discussed above, 
renewal is now also described as context-induced reinstatement 
in the addiction neuroscience literature [100]. 

The term “reinstatement” has also been used to describe mod- 
els of relapse after abstinence. Abstinence models may involve 
forced abstinence, where animals are not given access to drugs 
such as in the incubation of craving model, or abstinence that 
is technically self-imposed due to punishment or the availability 
of a more desirable alternative [133-135]. For example, Panlilio
and colleagues suppressed operant responding for opioids by punishment
with foot-shocks and then “reinstated” responding using
drug-priming [136]. Other studies using punishment-induced
abstinence have also referred to relapse-like processes as “reinstatement”
[137]. However, drug-seeking after abstinence is
not a recovery-from-extinction phenomenon. Rather, other authors
have tended to refer to such relapse-like processes as incubation
of craving [138] or simply as relapse [133, 139, 140].

Figure 1. Associative learning in Pavlovian and operant drug-primed and stress-induced reinstatement paradigms. (a) In Pavlovian conditioning, acquisition occurs by
repeated pairings of a previously neutral conditioned stimulus (CS) with an unconditioned stimulus (US). (b) During extinction, the CS is presented in the absence of the
US. (c) Prior to reinstatement, animals are exposed to the US. During test, the performance of the conditioned response is increased. (d) During operant conditioning, a
response (R), such as a lever press, is paired with discrete stimuli (S) and a reinforcer which has both a perceptual identity (O^I) and incentive value (O^V). (e) In extinction,
only the drug (O^I/O^V) is withheld, resulting in extinction of the R-O^I/O^V and S-O^I/O^V associations. (f) Immediately prior to reinstatement testing, rats are administered drug
to prime reinstatement or are exposed to a stressor. In either case, reinstatement must rely on the reactivation of the R-O^I/O^V associations because the S-O^I/O^V association
was previously extinguished. Note that in some drug-primed or stress-induced reinstatement paradigms, the cue (S) is omitted, though this does not alter the importance
of the R-O^I/O^V associations.

While an in-depth discussion of the associative and behavioural
processes underlying relapse-after-abstinence models is beyond
the scope of the present paper, it seems possible that punishment
and relapse models involve reacquisition, as suggested by
Panlilio and colleagues [136], or contextual or occasion-setting effects
that reactivate the acquisition memory. Marchant and colleagues
have also argued that punishment effects are dominated
by response-outcome associations that, like extinction, produce
context-dependent suppression of responding [141]. Where abstinence
is produced by the availability of a desirable alternative,
response competition is the most obvious possibility, and there
have even been studies that show, at least for cocaine, that drugs
can suppress the response for the appetitive non-drug alternative
[142].


Behavioural Processes in Drug-Primed Rein- 
statement 


Drug-primed reinstatement is consistent with the classical Pavlovian
definition of reinstatement because it is precipitated by
presentation of the drug outcome (Figure 1d-f). Associative learning
models posit that during acquisition the operant response (R)
becomes associated with both the drug-paired cues (S) and with the
drug outcome (O), which has both a perceptual identity (O^I) and
incentive value (O^V; Figure 1d) [143]. The result is associations
between R, S and O^I/O^V. In drug-primed reinstatement protocols,
animals may be trained without discrete drug-paired cues [123] or
extinction can involve presentation of the cue, omitting only drug
delivery [130]. For example, extinction procedures for drug-primed
reinstatement may be designed to merely withhold drug delivery
by substituting saline for drug or disconnecting the syringe pump
[123, 130], leaving any cues paired with infusions in place. Since
extinction is otherwise identical to self-administration training, it
is clear in these paradigms that the association between the operant
response and drug delivery is extinguished (R-O^I/O^V; Figure
1e). Moreover, if cues are present, then their association with the
drug outcome (S-O^I/O^V) is also extinguished. Drug-primed reinstatement
must therefore rely on the response-drug outcome
(R-O^I/O^V) association, which follows the classical definition of reinstatement
as occurring in response to the US [17, 22, 41, 56]. As
with Pavlovian reinstatement, theoretical accounts largely differ
with respect to whether the response-drug outcome association
is reactivated by contextual associations or whether its associative
value is restored.


Bouton and colleagues argue that reinstatement in Pavlovian 
and fear conditioning designs is a context effect [14, 28] and their 
reasoning clearly applies to drug-primed reinstatement. Since re- 
instatement is context-dependent in Pavlovian fear conditioning 
[80], Bouton and colleagues argue that reinstatement relies on 
the animal expecting the US in that context [14, 28]. In Pavlo- 
vian designs, this expectation is restored by presenting the US 
prior to test. In addiction studies, the drug-priming injection 
serves the same purpose. The subjective effects of the drug pro- 
duce an interoceptive context and these can influence extinction. 
For example, alcohol has previously been shown to result in state- 
dependent learning [144]. Citing these and other findings showing 
that drug-induced interoceptive states can influence behaviour, 
Bouton and colleagues argue that drugs produce interoceptive con- 
texts [28]. Drug-primed reinstatement is therefore a function of 
an interoceptive version of ABA renewal, where the priming injection
returns the animal to the acquisition context, retrieving the
response-drug outcome association which produces recovery of
responding.

Another explanation for drug-primed reinstatement can be 
drawn from the Rescorla-Wagner model, according to Delamater 
and Westbrook [56]. Delamater and Westbrook argue that the 
Rescorla-Wagner model [82] predicts reinstatement because the 
US presentations restore the associative strength of the previously 
extinguished stimulus. Since the design of drug-primed reinstate- 
ment is analogous to Pavlovian extinction and reinstatement, it 
could also be argued that drug-priming restores the associative 
strength between the response and the drug outcome (R-O^I/O^V),
driving a recovery of responding. In other words, both context 
theory and the Rescorla-Wagner model rely on the response-drug 
outcome association, but differ with respect to whether this asso- 
ciation is reactivated by a drug-induced interoceptive context or 
restored by drug priming. 
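
As a rough illustration of the Rescorla-Wagner account, the sketch below applies the model's standard error-correction update across acquisition, extinction, and unsignalled outcome presentations. It rests on assumptions that are not taken from the studies cited here: the context is treated as a stimulus present on every trial, the learning-rate values and trial numbers are arbitrary, and the function names (rw_update, simulate) are hypothetical. Under these assumptions, outcome-alone trials raise the summed associative strength available at test, which is one way the model can produce a reinstatement-like recovery. Applying the same logic to an operant response rather than a CS is a further assumption.

```python
def rw_update(V, present, lam, alpha=0.3, beta=1.0):
    """One Rescorla-Wagner trial.

    All stimuli present on the trial share a common prediction error
    (lam minus their summed associative strength).
    """
    error = lam - sum(V[s] for s in present)
    for s in present:
        V[s] += alpha * beta * error
    return V


def simulate(n_acq=30, n_ext=30, n_prime=3):
    """Return summed strength (cue + context) after each phase.

    Trial numbers and the 'context as a stimulus' treatment are
    illustrative assumptions, not values from any cited study.
    """
    V = {"cue": 0.0, "context": 0.0}

    # Acquisition: cue and context are both paired with the outcome.
    for _ in range(n_acq):
        rw_update(V, ["cue", "context"], lam=1.0)
    after_acq = V["cue"] + V["context"]

    # Extinction: cue and context are presented without the outcome.
    for _ in range(n_ext):
        rw_update(V, ["cue", "context"], lam=0.0)
    after_ext = V["cue"] + V["context"]

    # Priming: outcome delivered in the context alone (no cue).
    for _ in range(n_prime):
        rw_update(V, ["context"], lam=1.0)
    at_test = V["cue"] + V["context"]

    return after_acq, after_ext, at_test


if __name__ == "__main__":
    acq, ext, test = simulate()
    print(f"after acquisition: {acq:.2f}")
    print(f"after extinction:  {ext:.2f}")
    print(f"at test:           {test:.2f}")
```

In this sketch the recovery at test is carried entirely by the context term, so it does not by itself distinguish the restoration account from the context account described above.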


Behavioural Processes in Stress-Induced Rein- 
statement 


Similar associative mechanisms may be involved in stress- 
induced reinstatement. Stress-induced reinstatement experi- 
ments are conducted in a manner that is essentially identical 
to drug-primed reinstatement, except reinstatement is triggered 
by presentation of a stressor (Figure 1d-f). Stress-induced re- 
instatement paradigms can use a wide variety of stressors, such 
as acute food deprivation, foot-shock, and pharmacological stres- 
sors such as the anxiogenic drug yohimbine [131]. Stress-induced 
reinstatement has also been observed for a variety of drugs, including
heroin, cocaine, methamphetamine, nicotine, and alcohol
[131]. While most studies of food-seeking have found that stressors
did not induce reinstatement of food-seeking [131], it has
been shown to be possible under certain conditions, such as when
rats receive daily exposure to the calorie-dense cafeteria diet [145].
Just as for drug-primed reinstatement, extinction procedures for
stress-induced reinstatement can merely withhold drug delivery
by substituting saline for drug or disconnecting the syringe pump
[130, 146]. Some stress-induced reinstatement designs present
the cue during reinstatement [147], while others have omitted
drug-paired cues during reinstatement testing, despite their
presence during acquisition and extinction [146]. Stress-induced
reinstatement can also be paired with early life stress, such as post-weaning
social isolation, though this did not alter reinstatement
[148]. This procedural flexibility demonstrates that cue-drug outcome
associations (S-O^I/O^V) are not necessary for stress-induced
reinstatement. Therefore, as with drug-primed reinstatement,
the stressor must act to restore or reactivate the response-drug
outcome association (R-O^I/O^V) to cause recovery of responding.

Bouton and colleagues have applied context theory specifically 
to stressors and argue that stressors are most likely to promote re- 
covery of responding if they have also been paired with acquisition 
[28, 149, 150]. Schepers and Bouton have conducted a series of experiments
using hunger and a chronic variable stress protocol to
produce interoceptive contexts [149, 150]. When hunger or stress 
was associated with acquisition, but not extinction, then renewal 
was observed when these conditions were restored prior to test 
[149, 150]. Bouton and colleagues argue that these findings show
that stress produces an interoceptive context [28]. Therefore, as 
with drug-primed reinstatement, stress-induced reinstatement 
removes animals from the extinction context which enables re- 
trieval of the response-drug outcome association and recovery of 
responding. 

However, there are some procedural differences between 
Schepers and Bouton’s studies and the stress-induced reinstate- 
ment paradigm that suggest alternative explanations. In Schepers 
and Bouton’s studies, hunger and stress were present during both 
acquisition and test [149, 150]. In contrast, typical stress-induced 
reinstatement designs do not introduce the stressor until after extinction
[131]. For example, the first stress-induced reinstatement
papers used 10 min of intermittent foot-shock in the test context 
prior to the start of the session [146, 147]. Therefore, the design 
of stress-induced reinstatement studies does not follow the ABA 
renewal design used by Schepers and Bouton, but more closely re- 
sembles an interoceptive ABC renewal design because acquisition, 
extinction, and reinstatement are each associated with their own 
interoceptive states produced, respectively, by the drug’s subjec- 
tive effects, the absence of drug, and stress. Moreover, Schepers 
and Bouton used food as a reinforcer [149, 150], while most stud- 
ies of stress-induced reinstatement have shown that food-seeking 
is not reinstated by stress [131]. These procedural differences raise 
the possibility that stress-induced reinstatement is driven by factors
other than interoceptive contexts.

One possibility is that stress-induced reinstatement functions 
through largely non-associative affective mechanisms as animals 
attempt to relieve their negative affective state via drug-seeking. 
Drug addiction has previously been theorised to involve processes 
of negative reinforcement, where drug use alleviates aversive 
states [151-153]. If this were true, then it would imply that the 
recovery of responding observed during stress-induced reinstate- 
ment is goal-directed. As Trask and colleagues have argued, Pavlo- 
vian and operant extinction and recovery phenomena share many 
features and common processes, but goal-directed vs habitual ac- 
tions are unique to operant behaviours [154]. If stress-induced 
reinstatement is directed towards alleviating aversive states pro- 
duced by the stressor, then the recovery of responding relies on the 
association between the operant response and the affective value of 
the drug outcome. Therefore, stress-induced reinstatement may 
be sensitive to outcome devaluation manipulations and this possi- 
bility invites empirical verification. 


Behavioural Processes in Cue-Induced Reinstatement 


Cue-induced reinstatement is driven by ambiguous behavioural 
and associative processes. Cue-induced reinstatement is a com- 
mon relapse model that is driven by the presentation of an un- 
extinguished drug-paired cue [98, 155, 156]. During acquisition, 
animals first learn an operant response (e.g. lever press) for food- 
or drug-reward paired with a light or tone cue (Figure 2a). The 
operant response is then extinguished such that lever presses no
longer result in outcome delivery or presentations of the cue (Figure
2b) [103, 157-159]. This extinguishes both the response-cue
(R-S) and response-drug (R-O) associations, but leaves the cue- 
drug (S-O) associations intact (Figure 2b). Unlike other Pavlovian 
reinstatement paradigms, reinstatement of responding in cue- 
induced reinstatement is assessed using response-contingent pre- 
sentations of the food- or drug-paired cue (Figure 2c). In other 
words, the animal makes a response for the non-extinguished 
reward-associated cue. Now, given that reward-paired cues can 
elicit conditioned responding in and of themselves (see [160]), 
dissociating the mechanism mediating the return in responding 
when the cue is presented as the outcome for the extinguished re- 
sponse becomes challenging. It has also been noted that the us- 
age of the term reinstatement in cue-induced reinstatement is not 
consistent with the definition used in Pavlovian conditioning and 
most of the behavioural literature [28]. 

The associative processes underlying cue-induced reinstatement
are important from both theoretical and practical or translational
perspectives. Procedural differences between cue-induced
reinstatement and the therapeutic approaches it is supposed to
model reduce its translational potential. Specifically, the lack
of extinction of the S-O association in the cue-induced reinstatement
model does not properly model the cue-exposure therapy
paradigms in humans [161]. In a clinical context, cue-exposure
therapy involves repeated presentation of drug-associated cues,
which would be more appropriately modelled by Pavlovian extinction
of drug-paired cues rather than instrumental extinction. In
the cue-induced reinstatement paradigm, it is only the instrumental
response that is extinguished, so cue-induced reinstatement is
not an ideal model for cue-exposure therapy and relapse.

Figure 2. Associative processes in cue-induced reinstatement are ambiguous. (a) During operant conditioning, a response (R), such as a lever press, is paired with discrete
stimuli (S) and a reinforcer which has both a perceptual identity (O^I) and incentive value (O^V). (b) In extinction, both the cues and drug (O^I/O^V) are withheld, resulting in
extinction of the R-S and R-O^I/O^V associations. (c) During cue-induced reinstatement, the cue is available again (in some procedures, it is presented non-contingently at
the beginning of the session), but it is unclear whether reinstatement occurs via reactivation of the R-S, R-O^I, or R-O^V associations. In the absence of drug, reinstatement
also serves as extinction for S-O^I and S-O^V.


The ambiguity regarding the associative processes which drive
cue-induced reinstatement also limits its contribution to theory.
If the instrumental response is extinguished alone, it is unclear
what is causing the restoration of responding observed during reinstatement.
As discussed above, in Pavlovian conditioning it is
the restoration or reactivation of the CS-US association that drives
reinstatement. However, in operant cue-induced reinstatement
models, the analogous S-O association was never extinguished.
With respect to associative learning, this leaves three possibilities:
the R-S, S-O^I, and S-O^V associations. Experimental evidence
suggests that the R-S association alone is not the driver of cue-induced
reinstatement, for reasons that will be discussed below,
but there remains some ambiguity regarding the role of S-O^I or
S-O^V associations.


Cue-Induced Reinstatement Relies on Cue-Outcome Associations 


A small number of studies have demonstrated that cue-induced 
reinstatement relies on cue-drug outcome (S-O) associations 
because separate Pavlovian conditioning or extinction of drug- 
paired cues can alter later cue-induced reinstatement. One ex- 
ample is the Pavlovian cue-conditioned reinstatement approach, 
which demonstrates that a Pavlovian conditioned cue can promote 
reinstatement [162, 163]. As shown in Figure 3, rats are trained to 
self-administer cocaine without cues and given a single Pavlovian 
conditioning session in the middle of self-administration training. 
These Pavlovian conditioned cues can later precipitate reinstate- 
ment after instrumental extinction when they are presented con- 
tingently [162, 163]. Since the operant response and drug-paired 
cue were never combined, this design demonstrates that reinstate- 
ment relies on the Pavlovian associations between the cue and the 
drug outcome. 


Studies that combined Pavlovian non-contingent extinction 
with instrumental extinction have further shown the importance 
of the S-O associations and further demonstrated context effects. 
In these designs, rats are trained to self-administer in the pres- 
ence of cues (Figure 4a), before receiving two separate kinds of ex- 
tinction (Figure 4b). Instrumental extinction follows standard pro- 
cedures, omitting both cue and drug. However, additional Pavlo- 
vian extinction sessions present the cue alone in a non-contingent 
manner, extinguishing the S-O associations. At test, reinstatement
is diminished because all of the associations have been extinguished
(Figure 4c). These studies demonstrate that the Pavlovian
S-O association is important because the reinstatement test was
not simply identical to the instrumental extinction sessions, as it
was in previous studies [164]. Buffalari and colleagues also conducted
extinction with the cues present, which leaves R-S intact,
or Pavlovian extinction of the S-O association alone, leaving R-O 
intact [164]. They found, unsurprisingly, that rats extinguished 
with cues present showed the least reinstatement while rats that 
received Pavlovian extinction of the cue alone had the highest level 
of reinstatement [164]. These effects may also be context depen- 
dent. Torregrossa and colleagues gave rats instrumental extinc- 
tion and then a phase of non-contingent cue extinction in either 
the training context (A) or a distinct extinction context (B). They 
found that when non-contingent extinction was given in context 
A, this produced the lowest levels of cue-induced reinstatement 
[165]. Non-contingent cue extinction in context B was not effec- 
tive unless combined with d-cycloserine treatment [165]. 


Separate studies using the same approach in a single-context 
paradigm have replicated these results. In a study by Perry and 
colleagues, rats were trained to self-administer cocaine followed 
by standard instrumental extinction [166]. Rats that subsequently 
received Pavlovian non-contingent cue extinction showed reduced 
cue-induced reinstatement relative to controls [166]. Follow-up 
studies from the same group have replicated these findings in 
adult, but not adolescent rats undergoing cue-induced reinstate- 
ment [167], and shown that non-contingent cue extinction can 
effectively abolish incubation of craving [168]. Together, these 
findings seem to indicate that it is learned associations between 
the drug-paired cue and the drug outcome (S-O) that drive cue-induced
reinstatement. However, while these studies clearly
demonstrate that the S-O associations are important, they do not
provide evidence about whether it is S-O^I or general affective S-O^V
associations that drive reinstatement.


Unfortunately, there is no simple solution to this ambiguity be- 
cause cue-induced reinstatement designs require there to be dis- 
crete drug-paired cues during self-administration training and for 
those cues to be omitted during extinction. Unlike other recov- 
ery procedures, such as contextual renewal studies where drug- 
paired cues can be present [102, 169] or absent [103] during ex- 
tinction, cue-induced reinstatement has no alternative trigger for 
recovery. If drug-paired cues are retained during extinction, then 
the response rate will decline and the cue-induced reinstatement 
test will simply be identical to another extinction session. Con- 
temporary reinstatement designs do not reinforce responses dur- 
ing test [103, 157-159], nor is this a possible solution because oth- 
erwise they would be rapid reacquisition experiments [41, 169]. 
Figure 3. A Pavlovian cue, conditioned separately to the self-administration, can trigger reinstatement. (a) During instrumental training, animals learn to associate the
operant response with the outcome. They also receive a single, separate, Pavlovian conditioning session which pairs a cue with the same outcome. (b) Animals then receive
standard extinction for the operant response. (c) The Pavlovian conditioned cue is able to promote reinstatement without ever having been associated with the operant
response.


In order to produce a return of responding, there must be some 
kind of precipitating factor. In Pavlovian designs, this is achieved 
with the US, but for cue-induced reinstatement, this has to be the 
never-extinguished drug-paired cues. Current views centre on
the idea that it is a form of conditioned reinforcement [28]; however,
we argue that cue-induced reinstatement is more ambiguous
and complex than this. There are procedural and empirical reasons 
to believe that cue-induced reinstatement may be better concep- 
tualised as reacquisition of conditioned reinforcement or, alterna- 
tively, that it may involve Pavlovian to Instrumental Transfer [28]. 


Reinstatement as Reacquisition of Conditioned Reinforcement

Conditioned reinforcers are previously neutral stimuli that have 
become reinforcers through repeated pairings with a primary rein- 
forcer [170-172]. In some cases, the definition is operationalised 
by the requirement that they can support and maintain new oper- 
ant responses [173, 174]. The absence of extinction for the drug- 
paired cues’ S-O association suggests that the reinstatement ef- 
fect observed during cue-induced reinstatement might be better 
classified as conditioned reinforcement. This is analogous to
the idea proposed by Davis and Smith in 1976 that cues can pro- 
mote reinstatement [119]. Moreover, some researchers have gone 
as far as referring to cue-induced reinstatement as an alias for con- 
ditioned reinforcement [175]. Bouton and colleagues have also re- 
cently considered cue-induced reinstatement and argue that it is 
driven by conditioned reinforcement, rendering it distinct from 
drug-primed or stress-induced reinstatement [28]. Conditioned 
reinforcement involves animals responding for the cue and the 
classical Pavlovian view of conditioned reinforcement is that the 
cue itself acquires conditioned value [170, 171]. Conditioned rein- 
forcement is also one of the key phenomena cited in support of the 
incentive sensitization theory of addiction [176]. Consistent with 
the classical Pavlovian view, incentive sensitization theory posits 
that repeated pairings between the cue and the drug result in
some of the drug’s incentive motivational properties being trans- 
ferred to the cue. This incentive motivational transfer is thought 
to be observable in sign-tracking behaviour, where animals ap- 
proach and attempt to interact with appetitive cues [176]. 


Evidence for Conditioned Reinforcement in Cue-Induced Reinstate- 
ment 

Several studies have shown that discrete cues paired with drug 
delivery acquire conditioned reinforcing properties during self- 
administration as animals will respond for the presentation of 
these cues alone in later tests [174, 177-180]. These effects may be 
particularly pronounced for nicotine because nicotine-paired cues 
alone have been shown to maintain responding for several days after
a prolonged 40-day self-administration phase [181]. Even if cue
omission during extinction results in the extinction of the R-S association,
reinstatement is explained by the conditioned reinforcing
properties of the drug-paired cues, which have acquired their
own incentive value. It would therefore not be the R-O association 
that is reactivated during reinstatement, but an R-S-O association 
that drives responding. 

However, if cue-induced reinstatement really is driven by con- 
ditioned reinforcement, as both Kawa and colleagues and Bouton 
and colleagues have suggested [28, 175], then it may actually be a
form of reacquisition. In their study, Kawa and colleagues trained 
rats to nosepoke for cocaine. Each cocaine delivery was simultane- 
ously paired with presentation of a cue light. Following standard 
protocols, nosepokes made during extinction had no programmed 
consequences, but during their reinstatement test nosepokes re- 
sulted in cue presentation but not drug delivery [175]. However, 
the R-S association should have been extinguished by cue omis- 
sion during the extinction phase. When the CS and US are paired 
again after extinction in Pavlovian designs, this is referred to as 
reacquisition [41]. If the cue is a conditioned reinforcer and is 
paired with the response again after extinction, then this design 
more closely matches reacquisition of conditioned reinforcement 
than simply conditioned reinforcement. 

Even if cue-induced reinstatement is driven by conditioned re- 
inforcement or its reacquisition, this does not necessarily clarify 
the mechanism by which the cue elicits responding. For example, 
does responding increase during cue-induced reinstatement
because the cue is reinforcing in itself, or does cue presentation produce
an excitatory signal that stimulates further responding? Shahan
has argued that conditioned reinforcement occurs because the
cue acts as a sign-post towards the physiologically-relevant rein- 
forcer [182]. According to this view, animals respond for predictive 
stimuli because of their temporal relationship with the reinforcer 
[182-184]. Along with the classical Pavlovian account of condi- 
tioned reinforcement as acquiring conditioned value [170, 171], 
this would imply that conditioned reinforcement is driven by the 
S-O association. Parkinson and colleagues found that a sucrose- 
paired Pavlovian conditioned reinforcer is not sensitive to outcome 
devaluation, which they suggest may be because conditioned rein- 
forcers activate a central appetitive motivational state or can be- 
come a goal in their own right [185]. Further studies are required 
to assess whether these findings are relevant to cue-induced rein- 
statement for drugs, for example by conducting devaluation of the 
cue or outcome prior to reinstatement testing. 


Conditioned Reinforcement Does Not Fully Explain Cue-Induced Re- 
instatement 

Figure 4. Reinstatement relies on the Pavlovian association. (a) In rats trained under standard conditions, where their operant response is paired with both a drug outcome
and a cue, (b) if instrumental extinction is complemented with separate Pavlovian extinction sessions, (c) their reinstatement response is diminished.

Conditioned reinforcement provides a compelling explanation for
cue-induced reinstatement, but it does not fully explain all aspects
of the phenomenon. For example, non-contingent cue presentations
at the start of the session have been used to precipitate
cue-induced reinstatement in mice trained to nosepoke for nico- 
tine [186]. Non-contingent presentation of the cue prior to ex- 
tension of the lever can also promote reinstatement of cocaine- 
seeking and sucrose-seeking in rats [187, 188]. Moreover, non- 
contingent cue presentation is not a redundant reinstatement trig- 
ger because studies of cue-induced reinstatement of alcohol and 
cocaine-seeking have used non-contingent cue presentations in 
cases when animals did not earn their own cue-presentations by 
operant responding early in the session [157, 167, 189]. The time 
course of responding during cue-induced reinstatement also sug- 
gests that there are contributing factors other than conditioned re- 
inforcement. Tunstall and Kearns have reported, for example, that 
approximately half of lever presses during cue-induced reinstate- 
ment for cocaine occurred during the 10 s cue presentation [190]. 
Indeed, if responses during cue presentation are excluded, it ap- 
pears as if rats are barely increasing their responding above extinc- 
tion levels (approximately 20 responses under extinction vs. 30 
responses during reinstatement) [190]. Similar results have been 
found with cue-induced reinstatement of sucrose-seeking, where 
approximately half of responses were during cue presentation or 
time-out [191]. Assuming cue-induced reinstatement is driven 
by conditioned reinforcement, this pattern of responding suggests 
that rats are responding not only to obtain the cue, but because of 
the cue. 

Furthermore, contingent cocaine-paired cues appear to have 
no effect on established instrumental responding, suggesting 
they provide little conditioned reinforcement in many self- 
administration studies. If animals are trained to self-administer 
cocaine in the presence of cues, the removal of these cues does not 
alter responding [192]. Moreover, if extinction is initiated without 
cocaine delivery but with the presentation of cues, rats will rapidly 
extinguish their responding [192] with no significant difference 
when compared with rats that receive extinction of the lever alone 
[164]. If the cocaine-paired cues were indeed acting as conditioned 
reinforcers, as they have previously been shown to support the ac- 
quisition of a new response in sessions across multiple days [180], 
then an operant response that is still paired with the cue should be 
more resistant to extinction than a response on the lever alone. These results 
indicate that, at least for cocaine, the presence or absence of the 
cue during extinction is not sufficient to maintain responding. 

In the case of nicotine, conditioned reinforcement may make a 
larger contribution to cue-induced reinstatement. Nicotine facili- 
tates the acquisition of conditioned reinforcement [193] and cues 
are important for the acquisition of nicotine self-administration 
[194]. Once self-administration has become established, 
nicotine-paired cues can then maintain responding on their own (i.e. in the 
absence of nicotine) for months [195], demonstrating a powerful 
and persistent conditioned reinforcement effect. Similarly, cue- 
induced reinstatement for nicotine persists across multiple tests, 
although lower doses of nicotine may only support a single 
reinstatement test [196]. These results suggest that, although conditioned 
reinforcement might not fully explain cue-induced reinstatement 
for several drugs of abuse, its relative contribution varies between 
drugs and may be greater for nicotine. 


Cue-Induced Reinstatement as Pavlovian to Instrumental Transfer 

Cue-induced reinstatement protocols also strongly resemble 
Pavlovian to Instrumental Transfer (PIT) and there is empirical 
evidence that supports a role for PIT rather than conditioned re- 
inforcement alone. Indeed, conditioned reinforcement itself has 
been shown to be mediated by PIT mechanisms in some circum- 
stances [173] and procedures which produce PIT can also produce 
conditioned reinforcement [197]. For example, acquisition of an 
operant response for a conditioned reinforcer can be insensitive to 
outcome devaluation [173, 185], suggesting a general affective or 
excitatory effect analogous to general transfer in PIT (also called 
non-selective PIT). While there may be overlap in the processes in- 
volved in both conditioned reinforcement and PIT, they can be dis- 
tinguished behaviourally — such as in circumstances where a pro- 
cedure produces one but not the other — and neuropharmacologi- 
cal manipulations may also be specific for conditioned reinforce- 
ment or PIT [197]. In PIT paradigms, animals receive separate 
instrumental and Pavlovian conditioning for the same reinforcer 
(Figure 5a-b). At test, animals continue to perform instrumental 
responses, but presentations of the cue modulate the rate of instru- 
mental responding [197-199]. PIT mechanisms would therefore 
explain the pattern of responding observed during cue-induced re- 
instatement, where a very high percentage of responses occur dur- 
ing cue presentation [190, 191]. PIT is most commonly conducted 
using non-drug reinforcers such as food pellets [200], but studies 
using drug reinforcers have also been conducted [201-203]. In hu- 
mans, a nicotine-paired cue potentiated instrumental responding 
more than a food-paired cue [204], demonstrating that PIT may 
vary depending on the reinforcer. As shown in Figure 5c, specific 
PIT involves responding driven by the predictive value of the cue 
via an S-O′-R association while general PIT (Figure 5d) involves 
responding driven by retrieval of the affective value via an [S-O″]-R 
association [143]. As discussed above, reinstatement is driven by 
S-O associations [162, 164]. However, there remains some 
ambiguity in whether reinstatement is driven by S-O′ or S-O″ 
associations. PIT studies may therefore help to clarify this ambiguity. 

Figure 5. Pavlovian to Instrumental Transfer (PIT) effects may explain cue-induced reinstatement. During PIT animals receive separate (a) instrumental training and (b) Pavlovian conditioning for the same reinforcer. At test, presentation of the conditioned stimulus (S) increases instrumental responding (R) via (c) specific transfer of the predictive associations with the perceptual identity of the drug outcome (O′) or (d) general transfer of the affective value (O″) of the drug outcome that has become associated with the cue. 

The design of cue-induced reinstatement studies is more 
consistent with a general or non-selective PIT effect driven by an 
[S-O″]-R association due to its use of a single outcome. There are 
different PIT procedures that can preferentially evoke general and 
specific transfer, with the main variants called non-selective PIT 
and outcome-specific PIT [197, 198]. In non-selective paradigms, 
the Pavlovian conditioning phase involves two stimuli of which 
only one is reinforced and the instrumental phase involves a sin- 
gle lever paired with a single outcome. In outcome-specific trans- 
fer paradigms, the Pavlovian conditioning phase provides a dif- 
ferent reinforcer for each stimulus and the instrumental phase 
similarly trains two levers each paired with their own outcome 
[197, 198, 205, 206]. As reviewed by Cartoni and colleagues, non- 
selective PIT usually produces general transfer rather than specific 
transfer [197], which is thought to be mediated by the general ex- 
citatory or motivational function of the cue [198, 207, 208]. In 
cue-induced reinstatement, the design of the study is most sim- 
ilar to non-selective PIT because, although there are usually two 
levers and only one cue, there is only one outcome. Following Hol- 
land [209], Cartoni and colleagues suggest that general transfer 
tends to be associated with non-selective PIT paradigms because 
of their less detailed representations about the outcome. There- 
fore, it would be expected that cue-induced reinstatement would 
be mediated by a general PIT effect because the outcome represen- 
tations in these designs are singular. Holland also showed that ex- 
tended instrumental training (20 sessions) was more likely to re- 
sult in general transfer than minimal instrumental training (5 ses- 
sions) [209], which would also imply a role for general PIT in cue- 
induced reinstatement since self-administration studies typically 
involve 10 or more days of self-administration [157, 168, 189, 210-214]. 
Bouton and colleagues have recently suggested a role for 
general PIT in cue-induced reinstatement, noting many common 
neurobiological substrates between them [28]. 


There is also some experimental evidence that drug-paired 
cues can have outcome-specific effects. Rubio and colleagues 
trained rats to press one lever for cocaine and, on alternate days, to 
press a second lever for heroin [215]. Each drug delivery was paired 
with activation of a cue light above its respective lever. Rats were 
then subjected to standard lever extinction, where levers were in- 
serted but had no programmed consequences. At test, rats re- 
ceived initial non-contingent presentations of either the cocaine 
cue or heroin cue immediately prior to extension of all levers, with 
the lever corresponding to the drug presented at the start of the 
session triggering cue presentations [215]. They found that cues 
specifically reinstated responding on their lever, but that they did 
not trigger reinstatement on the alternative drug lever [215]. Their 
findings are consistent with previous studies of drug-primed 
reinstatement of polydrug use, where animals were trained on both 
cocaine and heroin, but a priming injection only reinstated 
responding on the lever that matched the drug prime [216]. These findings 
do not rule out a role for general PIT because they involve much 
more complex outcome representations, but they do suggest that 
reinstatement may be goal-directed. 

The procedural parallels between cue-induced reinstatement 
and PIT, together with evidence for a potential goal-directed 
component to reinstatement, suggest that PIT may 
contribute to cue-induced reinstatement. However, further studies 
that more precisely examine whether the specific S-O′ or general 
affective S-O″ associations drive cue-induced reinstatement are 
required. One approach is suggested by Clemens and colleagues 
who combined outcome devaluation with extinction and drug- 
primed reinstatement [217]. In their study, rats received nicotine 
self-administration training followed by outcome devaluation by 
pairing nicotine with lithium injections. Nicotine-primed rein- 
statement was impaired in animals that had received 10, but not 
47 days of self-administration training [217]. If this kind of design 
could be replicated for cue-induced reinstatement, it might pro- 
vide evidence about whether cue-induced reinstatement is driven 
primarily by general PIT or whether it is a goal-directed behaviour. 


Alternative Mechanisms in Cue-Induced Reinstatement 

Alternative explanations for cue-induced reinstatement may arise 
from other associative and non-associative mechanisms. Recent 
work has shown that performance in a PIT paradigm does not 
differ significantly in rats characterised as having an 
addiction-like phenotype, based on motivation in a progressive ratio, 
persistent responding during intermittent periods of drug 
unavailability, and punishment-resistant responding [218]. Although 
performance in the PIT paradigm did correlate with performance 
during cocaine self-administration, it is not clear whether this would 
translate to cue-induced reinstatement [218]. These studies did 
not address whether PIT mediated reinstatement directly, but be- 
cause PIT did not correlate with other addiction-like behaviours, 
it suggests that these behaviours may not be completely driven by 
the associative mechanisms discussed above. 

One possible alternative mechanism is habit learning. Habit 
learning is thought to involve cue-elicited drug-seeking without 
retrieval of drug outcome (O′ or O″) memories [143]. While some 
have disputed the importance of habits in drug addiction [219], 
habit formation is commonly thought to support drug addiction 
[217, 220, 221]. However, if habitual responding does not rely on 
retrieval of the drug outcome, then this raises the question of whether it is 
reactivating R-O associations, as Pavlovian reinstatement approaches 
are thought to. It also might not be expected that Pavlovian 
non-contingent cue extinction would be effective in reducing habitual 
responding if no drug outcome memories are required. 

Another mechanism that could play a role in cue-induced rein- 
statement is incubation of craving. As discussed above, incubation 
of craving refers to the time-dependent increase in drug-seeking 
after cessation. Evidence from both human [222] and animal [133] 
studies has shown an increase in the degree of cue-induced 
craving or reinstatement after longer periods of abstinence [223]. For 
instance, humans experiencing incubation of craving report crav- 
ing the drug more when exposed to drug-related cues after 35 
days than after 7 days [222]. This increase in craving initially ap- 
pears to be a non-associative mechanism that runs contrary to 
the classical associative learning view that associative strength can 
decay over time [27]. However, there are also plausible associa- 
tive accounts of incubation of craving, such as a loss of reactive 
inhibition [224], weakening of an opponent process that was in- 
hibiting craving [153], or the Kamin effect — a U-shaped mem- 
ory retention curve [225-229]. Further, incubation of craving ap- 
pears to modulate drug memory retrieval because it can be inhib- 
ited with further extinction training. This appears to be effective 
whether animals are given instrumental extinction [230] or Pavlo- 
vian non-contingent cue extinction [168], indicating that retrieval 
of both the R-O and S-O associations may be important in rein- 
statement. Therefore, cue-induced reinstatement may, at least in 
part, rely on alternative mechanisms that modulate retrieval of the 
previously-extinguished R-O association and unextinguished S-O 
association. 


Reinstatement Nomenclature 


As noted above, the terminology of reinstatement differs between 
the addiction neuroscience literature and the behavioural litera- 
ture [28]. This is not unusual in historical terms, as the term re- 
instatement has been variously used to refer to resurgence [67], 
and with respect to cues in a study now considered an antecedent 
of contextual renewal [79]. Reinstatement has also been used 
since the earliest operant relapse models in the addiction neuro- 
science literature emerged in the 1970s and 1980s [119, 120, 123]. 
However, the differing usage of the term reinstatement between 
drug self-administration studies and the generally Pavlovian be- 
havioural literature does need to be recognised [28]. Despite both 
being described as reinstatement, Pavlovian reinstatement and 
cue-induced reinstatement in drug self-administration studies 
are clearly driven by diverse associative mechanisms. Indeed, the 
term reinstatement in the addiction literature has become more 
of an umbrella term, encompassing relapse-like models driven by 
drug priming, stress, cues, and context change. 

The literature contains other examples of such pragmatic 
resolutions of differences in nomenclature. For example, the 
orexin/hypocretin system was simultaneously discovered by two 
research groups via different approaches and given two names — 
orexin and hypocretin [231, 232]. Both terms have neuroanatomic 
or behavioural merit and are widely used, resulting in a 
compromise on nomenclature: hypocretin is the official gene name and 
pharmacologists use orexins to describe the ligands and 
receptors [233]. Corticotropin releasing factor or corticotropin 
releasing hormone (CRF/CRH) also has a disputed nomenclature based 
on considerations of molecular structure and hormonal or extra- 
hormonal functions [234, 235]. Like the orexin/hypocretins, one 
term (CRH) became the official nomenclature for geneticists while 
the other (CRF) is used by pharmacologists to describe the protein 
products [234]. In each case it is now a practical necessity to 
recognise both terms because of their widespread usage. It seems that 
a similar compromise is emerging for reinstatement, as it is now 
acknowledged that the term’s usage is different between the addic- 
tion and behavioural literature [28]. 


Conclusions 


Several recovery-from-extinction approaches are currently used 
in addiction neuroscience to model relapse. These include sponta- 
neous recovery, rapid reacquisition, resurgence, renewal, and re- 
instatement. In each case, there are multiple associative learning 
approaches that can provide insight into how the 
operant response recovers after extinction, with context theory 
being one of the most influential. In most cases, the associative 
processes in Pavlovian designs and operant drug self-administration 
studies are similar, with the exception of cue-induced 
reinstatement, where recovery of responding is driven by an 
ambiguous process associated with the unextinguished drug-paired cue. 
Since the instrumental response is extinguished with respect to 
the drug outcome, the reinstatement effect is described by some 
as a conditioned reinforcement effect, even though the cue is also 
omitted during extinction. However, examination of the exper- 
imental design suggests it is more akin to reacquisition of con- 
ditioned reinforcement. The pattern of responding during cue- 
induced reinstatement also implies that animals are responding 
because of the cue, in addition to responding for the cue, suggest- 
ing a potential role for Pavlovian to Instrumental Transfer. There 
are also alternative mechanisms, such as incubation of craving, 
that may modulate the retrieval of operant associations, including 
the response-drug outcome (R-O) association. While the associative processes 
that contribute to cue-induced reinstatement remain ambiguous, 
this ambiguity suggests several additional hypotheses related to 
conditioned reinforcement and PIT in cue-induced reinstatement 
that invite empirical validation. Although reinstatement terminol- 
ogy and experimental procedures differ between associative learn- 
ing and addiction neuroscience, it is clear that associative learning 
mechanisms are highly relevant and informative to understand- 
ing the processes mediating relapse-like behaviours. As scientists 
turn to associative learning models to develop improvements in 
extinction-based therapies for addiction [3, 9, 10], a better under- 
standing of the associative learning that underpins relapse is likely 
to be essential for improving future clinical outcomes. 